BARRIERS AND LOCAL MINIMA IN ENERGY LANDSCAPES 
OF STOCHASTIC LOCAL SEARCH 



PETTERI KASKI 






Abstract. A local search algorithm operating on an instance of a Boolean constraint satisfaction 
problem (in particular, k-SAT) can be viewed as a stochastic process traversing successive adjacent 
states in an "energy landscape" defined by the problem instance on the n-dimensional Boolean 
hypercube. We investigate analytically the worst-case topography of such landscapes in the context 
of satisfiable k-SAT via a random ensemble of satisfiable "k-regular" linear equations modulo 2. 

We show that for each fixed k = 3, 4, ..., the typical k-SAT energy landscape induced by an 
instance drawn from the ensemble has a set of 2^{Ω(n)} local energy minima, each separated by an 
unconditional Ω(n) energy barrier from each of the O(1) ground states, that is, solution states 
with zero energy. The main technical aspect of the analysis is that a random k-regular 0/1 matrix 
constitutes a strong boundary expander with almost full GF(2)-linear rank, a property which also 
enables us to prove a 2^{Ω(n)} lower bound for the expected number of steps required by the focused 
random walk heuristic to solve typical instances drawn from the ensemble. These results paint a 
grim picture of the worst-case topography of k-SAT for local search, and constitute apparently the 
first rigorous analysis of the growth of energy barriers in a random ensemble of k-SAT landscapes 
as the number of variables n is increased. 

1. Introduction 

1.1. Background and Motivation. Stochastic local search algorithms [21 [55] have in practice 
proven to be surprisingly efficient in solving instances of difficult constraint satisfaction problems 
(see [10, 93] for recent examples). Yet the basic analytical principles underlying the success or 
failure of local search heuristics are far from being understood. 

The objective of the present work is to shed new analytical light on the combinatorial phenomena 
that can occur in "energy landscapes" [85] governing the operation of most local search 
algorithms used in practice. Indeed, the difficulty in analyzing even the most elementary heuristics 
largely stems from the fact that the energy landscapes induced by the problem instances do not 
easily yield to combinatorial analysis. 

To set the stage, one of the best-understood settings for constraint satisfaction 
problems is a system of linear equations Ax = b (mod 2) over n variables x_1, x_2, ..., x_n assuming 
0/1 values (that is, "XORSAT"). The following example will prove to be illustrative. 

A local search algorithm can now be viewed as a stochastic process that traverses a sequence 
of adjacent states in the energy landscape associated with the problem instance. For a linear 
system Ax = b (mod 2), the states of the landscape consist of the 2^n possible assignments s = 
(s_1, s_2, ..., s_n) of 0/1 values to the variables x_1, x_2, ..., x_n. Any two states are adjacent if they 
differ in the value of exactly one variable; the distance between two states is the number of variables 
having different values in the two states. Associated with each state s is an energy E(s) equal 
to the number of equations violated by the assignment x = s. For example, with lines indicating 
adjacency and energy indicated by subscripts, the landscape associated with (1) is depicted below. 

The "simple" setting of linear equations is motivated because it provides direct insight into 
landscape phenomena in less tractable settings, in particular, in the context of the k-satisfiability 
problem (k-SAT) [31]. Indeed, a linear equation with k variables is logically equivalent to a 
conjunction of 2^{k−1} SAT clauses of length k that exclude the 0/1 assignments violating the 
equation. Furthermore, assuming that energy in SAT is defined as the number of violated clauses, the 
landscape of the SAT encoding of Ax = b (mod 2) is identical to the linear landscape. Thus, any 
landscape phenomenon that occurs in the context of linear equations also occurs in SAT. 
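
The clausal encoding described above can be illustrated with a minimal sketch. The function names `xor_to_clauses` and `violated`, and the convention of signed 1-based literals, are assumptions made here for illustration and do not come from the paper.

```python
from itertools import product

def xor_to_clauses(variables, b):
    """Clauses (lists of signed 1-based literals) equivalent to
    sum(x_v for v in variables) = b (mod 2)."""
    clauses = []
    for bits in product((0, 1), repeat=len(variables)):
        if sum(bits) % 2 != b:  # this assignment violates the equation
            # Build the clause that is false exactly on `bits`:
            # literal x_v where the bit is 0, literal NOT x_v where it is 1.
            clauses.append([v if bit == 0 else -v
                            for v, bit in zip(variables, bits)])
    return clauses

def violated(clause, assignment):
    """True if every literal of the clause is false under `assignment`
    (a dict mapping variable index to 0 or 1)."""
    return all(assignment[abs(lit)] == (0 if lit > 0 else 1) for lit in clause)
```

For an equation on k variables the sketch produces 2^{k−1} clauses, and any assignment violating the equation falsifies exactly one of them, which is why the SAT energy coincides with the linear energy.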

In the present work we seek to understand what an energy landscape "can look like" to local 
search heuristics, in the worst case. The two standard heuristics that occur in most local search 
algorithms are: (a) energy bias — the algorithm prefers (in probability) moving into adjacent states 
with lower energy over those with higher energy; and (b) focusing — the algorithm prefers moving 
into adjacent states such that the move affects the constraints that are violated in the current state. 

Exerting an energy bias does not always guide a search towards a solution, as can be immediately 
seen from (2). To study the worst-case extent of this phenomenon, we consider two standard 
combinatorial measures of "ruggedness" in a landscape: (a) the local minimum states, that is, 
the states with positive energy whose adjacent states all have strictly higher energy, and (b) the 
global energy barrier separating a state s from a state t, that is, the minimum increase in energy 
over E(s) required by any walk from s to t consisting of successive adjacent states. Of special 
interest are the barriers separating local minima from ground states, that is, the zero-energy 
solution states. For example, in (2) the local minimum states are 1110, 1101, 1011, and 0111, each 
separated by a barrier of 3 − 1 = 2 from the unique ground state 0000. 
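
The definitions of local minimum and energy barrier can be checked by brute force on a small instance. The sketch below uses a hypothetical 3-regular system on four variables (row i contains every variable except x_i); this matrix is an assumption chosen because it reproduces the local minima and barrier quoted above, and it is not necessarily the system displayed in the paper. The barrier is computed with a bottleneck variant of Dijkstra's algorithm.

```python
from itertools import product
import heapq

# Hypothetical 4 x 4 system A x = 0 (mod 2); an assumption, not from the source.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]

def energy(A, s):
    """Number of equations of A x = 0 (mod 2) violated by state s."""
    return sum(sum(a * x for a, x in zip(row, s)) % 2 for row in A)

def neighbors(s):
    """States at distance one from s: flip exactly one coordinate."""
    for j in range(len(s)):
        yield s[:j] + (1 - s[j],) + s[j + 1:]

def local_minima(A):
    """States with positive energy whose neighbors all have higher energy."""
    n = len(A[0])
    return [s for s in product((0, 1), repeat=n)
            if energy(A, s) > 0
            and all(energy(A, t) > energy(A, s) for t in neighbors(s))]

def barrier(A, s, t):
    """Minimum over walks from s to t of (peak energy on the walk) - E(s)."""
    best = {s: energy(A, s)}
    heap = [(best[s], s)]
    while heap:
        peak, u = heapq.heappop(heap)
        if u == t:
            return peak - energy(A, s)
        if peak > best[u]:
            continue  # stale heap entry
        for v in neighbors(u):
            p = max(peak, energy(A, v))
            if p < best.get(v, float("inf")):
                best[v] = p
                heapq.heappush(heap, (p, v))
```

Exhaustive enumeration is only feasible for very small n, since the landscape has 2^n states.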

From the perspective of the focusing heuristic, a benchmark algorithm is the focused random 
walk [81] (in each step, select uniformly at random one violated constraint, and flip the value of 
one variable selected uniformly at random among the variables occurring in the constraint). 
Focusing can also perform poorly, as can be seen by considering the transition probabilities in (1) and 
(2) for the focused random walk. 
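
A minimal sketch of the focused random walk for a homogeneous system Ax = 0 (mod 2) follows. The function names and the `max_steps` cutoff are assumptions introduced for illustration; the step rule itself is the one described in the paragraph above.

```python
import random

def violated_rows(A, s):
    """Indices of the equations of A x = 0 (mod 2) violated by state s."""
    return [i for i, row in enumerate(A)
            if sum(a * x for a, x in zip(row, s)) % 2 == 1]

def focused_random_walk(A, s, rng, max_steps=10_000):
    """Run the walk from state s; return (final state, number of flips).

    The walk stops early when no equation is violated. If the cutoff is
    reached, the caller should check the energy of the returned state.
    """
    s = list(s)
    for step in range(max_steps):
        bad = violated_rows(A, s)
        if not bad:
            return tuple(s), step
        row = A[rng.choice(bad)]                          # one violated equation
        j = rng.choice([q for q, a in enumerate(row) if a])  # one of its variables
        s[j] ^= 1                                         # flip it
    return tuple(s), max_steps
```

On the identity matrix every violated equation contains a single variable, so each flip repairs one equation and the walk terminates after exactly as many flips as the initial state has ones.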

The subsequent analysis paints a grim picture of the worst-case topography that heuristics face 
already in the "simple" case of k-regular linear equations, and hence, in the case of k-SAT. The 
present results constitute apparently the first rigorous topographical analysis of the energy 
landscapes induced by a nontrivial random ensemble. (See §1.3 for a discussion of related work.) 

1.2. Statement of Results. Throughout this work we assume that k = 3, 4, ... is fixed. In 
particular, any asymptotic notation O(·), Ω(·), o(·) always refers to the parameter n growing without 
bound and k remaining fixed. Furthermore, the constants hidden by the asymptotic notation in 
general depend on the fixed parameters, such as k and ε in Theorem 1. 

An n × n matrix with 0/1 entries is k-regular if every row and every column has exactly k 
nonzero entries. For a given n, a random k-regular matrix refers to a k-regular n × n matrix 
selected uniformly at random from the set of all such matrices. Similarly, a random k-regular 
landscape refers to the energy landscape associated with a system Ax = 0 (mod 2), where A is a 
random k-regular matrix. 

Theorem 1 (Energy barriers and local minima). For each fixed k = 3, 4, ... and ε > 0 it holds 
that a random k-regular landscape has with probability at least 1 − ε the following three properties: 

(i) the number of ground states is O(1); 

(ii) any two distinct ground states have distance Ω(n) and are separated by an Ω(n) energy 
barrier from each other; 

(iii) there exists a set of 2^{Ω(n)} local minima such that each local minimum is separated by an 
Ω(n) energy barrier from every ground state. 

Thus, an energy landscape can be very uneven indeed. Furthermore, Theorem 1 leaves no 
possibility for "trivial" barriers caused by large local fluctuations of energy. Indeed, because each 
variable occurs in k = O(1) equations, it follows that moving from one state into an adjacent state 
changes the energy by at most k units, implying that the extensive energy barriers are a global 
phenomenon apparently not easily circumvented with local heuristics. Due to the connection with 
k-SAT, identical lower bounds hold for k-SAT landscapes in the worst case. Interestingly, this 
worst-case phenomenon occurs at a ratio α = 2^{k−1} of clauses to variables, which is well below the 
SAT/UNSAT threshold H EJ |63] for the "random k-SAT" [SIESIED] ensemble. 

The focused random walk can also be shown to fail systematically for random k-regular systems. 

Theorem 2 (Lower bound for focused random walk). For each fixed k = 6, 7, ... and ε > 0 it holds 
that the system Ax = 0 (mod 2) defined by a random k-regular matrix A has with probability at 
least 1 − ε the property that the focused random walk requires 2^{Ω(n)} expected steps to arrive at a 
ground state when started from an initial state selected uniformly at random. 

The main technical hurdle in establishing Theorems 1 and 2 is the following result, which we 
expect to be of independent interest (see §1.3), in particular due to its role in establishing the 
existence of strong k-regular boundary expanders with almost full linear rank. 

Theorem 3. The expected size of the kernel of a random k-regular matrix over GF(2) is O(1). 
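
To make the quantity in Theorem 3 concrete, the size of the kernel of a given 0/1 matrix over GF(2) can be computed by Gaussian elimination. The following is a minimal sketch in which rows are packed into integer bitmasks so that XOR performs row addition modulo 2; the function names are assumptions made for illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    masks = [int("".join(map(str, row)), 2) for row in rows]
    rank = 0
    for bit in reversed(range(len(rows[0]))):
        # Look for a pivot row with this bit set among the unprocessed rows.
        pivot = next((i for i in range(rank, len(masks))
                      if (masks[i] >> bit) & 1), None)
        if pivot is None:
            continue
        masks[rank], masks[pivot] = masks[pivot], masks[rank]
        # Clear this bit from every other row.
        for i in range(len(masks)):
            if i != rank and (masks[i] >> bit) & 1:
                masks[i] ^= masks[rank]
        rank += 1
    return rank

def kernel_size(rows):
    """Number of vectors x with A x = 0 over GF(2)."""
    return 2 ** (len(rows[0]) - gf2_rank(rows))
```

The kernel size equals the number of ground states of the landscape associated with Ax = 0 (mod 2).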

A matrix A is a (k, ω, η)-boundary expander if (a) the number of nonzero entries in every 
column is at most k, and (b) for all w = 1, 2, ..., ⌊ω⌋, every submatrix consisting of w columns 
of A has at least ⌈ηw⌉ rows containing exactly one nonzero value. The following theorem is well 
known (cf. [53, Theorem 4.16(2)]). 

Theorem 4. For each fixed k = 3, 4, ... and δ > 0 there exists a β > 0 such that a random 
k-regular matrix is a (k, βn, k − 2 − δ)-boundary expander with probability 1 − o(1). 

[[ N.B. A proof of Theorem 4 is provided in Appendix A. ]] 
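
The boundary expander definition can be verified directly on small matrices. The sketch below is a brute-force check that enumerates all column subsets, so it is exponential in the number of columns and only meant to illustrate the definition; the function name and the 0/1-entry assumption are introduced here.

```python
from itertools import combinations
from math import ceil, floor

def is_boundary_expander(A, k, omega, eta):
    """Check whether the 0/1 matrix A is a (k, omega, eta)-boundary expander."""
    n_rows, n_cols = len(A), len(A[0])
    # (a) every column has at most k nonzero entries
    if any(sum(A[i][j] for i in range(n_rows)) > k for j in range(n_cols)):
        return False
    # (b) every set of w <= floor(omega) columns has at least ceil(eta * w)
    #     rows containing exactly one nonzero entry within those columns
    for w in range(1, min(floor(omega), n_cols) + 1):
        for cols in combinations(range(n_cols), w):
            boundary = sum(1 for i in range(n_rows)
                           if sum(A[i][j] for j in cols) == 1)
            if boundary < ceil(eta * w):
                return False
    return True
```

The rows counted in (b) are exactly the equations that are certainly violated when the variables of the chosen columns are flipped away from a ground state, which is how expansion translates into energy.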

Applying Markov's inequality to Theorem 3 and combining with Theorem 4, it follows that for 
each fixed k = 3, 4, ..., δ > 0, and ε > 0 there exist constants d > 0 and β > 0 such that with 
probability at least 1 − ε a random k-regular matrix both (a) has a kernel of size at most 2^d and 
(b) is a (k, βn, k − 2 − δ)-boundary expander. This provides the technical foundation for Theorems 
1 and 2. 
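
The combination of Markov's inequality with the expansion theorem can be spelled out as follows. This is a sketch of the standard argument; the symbol C for the constant implied by Theorem 3 is introduced here, and the bounds are meant for all large enough n.

```latex
% C denotes the constant implied by Theorem 3 (a symbol introduced here).
\[
  \Pr\bigl[\,|\ker A| > 2^{d}\,\bigr]
  \;\le\; \frac{\mathbb{E}\,|\ker A|}{2^{d}}
  \;\le\; \frac{C}{2^{d}}
  \;\le\; \frac{\epsilon}{2}
  \qquad\text{whenever } d \ge \log_2\frac{2C}{\epsilon},
\]
\[
  \Pr\bigl[\,A \text{ is not a } (k,\beta n,k-2-\delta)\text{-boundary expander}\,\bigr]
  \;=\; o(1) \;\le\; \frac{\epsilon}{2}
  \qquad\text{for all large enough } n,
\]
\[
  \Pr\bigl[\text{(a) and (b) both hold}\bigr]
  \;\ge\; 1 - \frac{\epsilon}{2} - \frac{\epsilon}{2} \;=\; 1 - \epsilon .
\]
```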

1.3. Connections and Related Work. Random ensembles of constraint satisfaction problems 
such as "random k-XORSAT" [271 [Ml E] and "random k-SAT" [21, 25, 70] have received 
extensive attention both from the computer science and the statistical physics communities [3l [33| 
H^l [52l [59| [68l [7T]. In particular, the random k-XORSAT ensemble is by now well understood as 
regards rigorous analysis of the transition phenomena as the ratio α of the number of equations to 
variables is increased [26, 27, 32, 69], and a similar rigorous foundation is emerging for random 
k-SAT [¥l[5l l41U65], where the corresponding control parameter α is the ratio of the number of clauses 
to variables. The present work differs from these studies by (a) considering an essentially different 
random ensemble, and (b) focusing on the topography of the complete energy landscape, whereas 
most of the recent effort, e.g. O [651 [66l [76] 177], in studies of random k-XORSAT and random 
k-SAT has gone to investigating "only" the distance distribution between the ground states akin to 
Theorem 1(ii). (An exception is [72], where it is shown that in the limit n → ∞ the energy barriers 
in random k-XORSAT between nearby ground states are bounded from below by −C log(α_d − α) 
for some constant C > 0 as the control parameter α approaches the dynamical transition point 
α_d [69].) The growth of energy barriers and local minima as a function of the system size n has 
apparently not been rigorously investigated in random ensembles until the present work. 

The structure of energy landscapes associated with local search algorithms and spin-glass models 
of statistical physics [181 EZ] has been the focus of many empirical and quasi-rigorous 
statistical-physics studies, e.g. [H] [201 EEl [371 ESI [96]; however, rigorous results are scarcer. In this 
connection at least one result exists: in [78] it is shown that a deterministic 3-regular matrix 
family based on a triangular lattice has an associated landscape with local minima separated by an 
O(log(n)) barrier from the ground state; a benchmark study of SAT-solvers using 3-SAT instances 
derived from this family is carried out in [57]. A general survey of combinatorial landscapes in 
various contexts is [85]. 

From the perspective of computer science and statistical physics, the "random satisfiable 
k-regular XORSAT" ("ferromagnetic k-spin model with Ising spins and fixed connectivity k") 
ensemble studied in the present work has apparently been the focus of only relatively few studies, 
despite the fact that the study of random k-regular matrices (equivalently, random k-regular 
bipartite graphs with a fixed bipartition) has a long history in mathematics \n\ I99|. To the best of 
our knowledge, from a computational / statistical physics perspective the few works addressing the 
present ensemble are [73], where an analysis of the correlation times of the Glauber dynamics on a 
corresponding spin-glass model is carried out, and [U], where clausal encodings for the k = 3 case 
are used to empirically benchmark SAT-solvers; further experiments for the k > 3 case are reported 
in [56]. Statistical physics studies on analogous fixed-connectivity models include [39l HOI EH EH]. 

From a mathematical perspective it is immediate that the analysis of k-regular matrices over 
GF(2) is closely related to the study of low-density parity-check codes (LDPC codes) [42, 87] 
in coding theory. In coding-theoretic language, Theorem 3 states that the expected total number 
of codewords in a linear code defined by a parity-check matrix drawn from the k-regular matrix 
ensemble is O(1) (indicating that such codes have very limited applicability from a coding-theoretic 
perspective). From a methodological perspective, however, the tools used to analyze the average 
weight distribution of the codewords in standard LDPC code ensembles are analogous to the tools 
used to prove Theorem 3 (cf. [121 ED EH [Ml ISQl [83]), the main difference being that we want to 
bound the expected total number of codewords rather than the number of codewords with a specific 
relative weight, necessitating uniform upper bounds that enable summation over all the weights 
w = 0, 1, ..., n. 

Theorems 1 and 2 are apparently the first results where expansion is employed in lower bound 
results aimed at understanding local search, despite the fact that expansion is a basic tool in numerous 
lower bound constructions in, e.g., proof complexity [71 1151 1^5 1 197|, where many constructions 
are based on clausal encodings of linear equations. In particular, the probabilistic full-rank boundary 
expander constructions in [HI [9] apparently provide an analogue of Theorem 1 in the special 
case k = 3; however, this is not immediate due to lack of regularity. In the converse direction, the 
present Theorem 3 and Theorem 4 imply (by stripping dependent rows and columns) the existence 
of full-rank boundary expanders for every k ≥ 3, thereby providing partial progress to the open 
lower bound questions in [HI §5]. An interesting technical contrast to the present lower bound 
results is that the upper bound for the focused random walk in [6] also relies on typical expansion 
properties of random 3-SAT instances. A recent survey of expansion and its applications is [53]. 

A large number of stochastic local search algorithms for the k-SAT problem are based on variations 
and combinations of the energy bias and focusing heuristics. Arguably the two central algorithm 
families in this respect are (a) algorithms in the "WalkSAT family" |621 IM] (e.g. [54', 'HP, '921 
l95]), and (b) algorithms based on variations of the Metropolis dynamics [M] (e.g. [inil23l,58, 93j). 
(The recent survey propagation algorithm [191 [6HI [75] for random k-SAT also employs local search, 
but only as a postprocessing step after a "global" form of belief propagation [11, 61].) 

Only relatively few rigorous upper and lower bound results are known for the running time of 
local search algorithms for k-SAT. For the focused random walk with restarts, it is known [92] that 
a satisfying assignment in any satisfiable instance of k-SAT is found in O(n(2 − 2/k)^n) expected 
steps, k ≥ 3. In [6] it is shown that the focused random walk finds a satisfying assignment in 
O(n) steps with high probability for a typical instance drawn from the random 3-SAT ensemble 
for α < 1.63. An exponential upper bound improving upon the trivial O(2^n) is derived in [51] for 
a "cautious" randomized greedy approach. In terms of lower bounds, families of crafted instances 
whose solution requires an expected exponential number of steps of the focused random walk are 
known; see [6] and [82, §11.5.6]. Explicit families of instances forcing exponential expected running 
times for certain randomized greedy heuristics are constructed in [51]. Quasi-rigorous statistical 
physics studies considering local search heuristics include [13, 96]. 

From the perspective of local search algorithms for k-SAT, the present Theorem 2 apparently 
provides the first example of a nontrivial random ensemble with exponential lower bounds on 
the expected running time for the focused random walk. Furthermore, the energy barriers and 
local minima demonstrated in Theorem 1(iii) constitute a step towards rigorous lower bounds for 
more complex heuristics relying on a combination of energy bias and focusing. In this regard 
the subsequent proof of Theorem 2 actually provides a meager first step: for large enough k it 
is immediate from (20) that a comparably "small" energy bias is insufficient to overcome the 
systematic drift away from ground states caused by focusing and expansion. 

As regards energy bias heuristics alone, the convergence properties of nonfocused variants of the 
Metropolis dynamics (simulated annealing [23, 58] in particular) have been extensively analyzed; 
see [H EH [301 HBl [HI [90] and the references therein. However, these analyses typically adopt a 
worst-case setting necessitating that a ground state is found with significant probability from every 
possible initial state. To arrive at a rigorous analysis of the typical behavior from a random initial 
state akin to Theorem 2, a study of the landscape structure beyond the properties in Theorem 1 
is apparently required. In particular, the structure of the attraction basins (see [37]) of the local 
minima in Theorem 1(iii) in relation to the attraction basins of the ground states needs to be better 
understood. 

1.4. Organization. The remainder of this work is organized as follows. The conventions and 
mathematical preliminaries are reviewed in §2. Theorem 3 is proved in §3. Theorems 1 and 2 are 
proved in §4. 

1.5. Acknowledgments. The author would like to thank Mikko Alava, Pekka Orponen, and 
Sakari Seitz for useful discussions, and Jukka Kohonen for insight with the proof of Lemma 10. 
This research was supported in part by the Academy of Finland, Grant 117499. 

2. Preliminaries 

2.1. Conventions. A vector always refers to an n-dimensional column vector with elements in 
the finite field GF(2) = {0, 1}. All arithmetic on vectors is over GF(2). For j = 1, 2, ..., n, denote 
by e_j the standard basis vector with the jth element equal to 1 and all other elements equal to 
0. A state is a synonym for vector when landscapes are discussed. The weight W(u) of a vector 
u is the number of nonzero elements. In accordance with the definitions in §1.1, the energy of a 
state s with respect to the system Ax = 0 (mod 2) is defined by E(s) = W(As). The distance 
between states s and t is D(s, t) = W(s + t). A state s is a local minimum if E(s) > 0 and 
E(s + e_j) > E(s) holds for all j = 1, 2, ..., n. 

2.2. Asymptotics. All logarithms are to the natural base exp(1) = Σ_{k=0}^{∞} 1/k!. We recall a variant 
[89] of Stirling's formula, valid for all positive integers n, 

(3)  $\sqrt{2\pi}\,\exp\Bigl(n\log(n) - n + \frac{\log(n)}{2} + \frac{1}{12n+1}\Bigr) < n! < \sqrt{2\pi}\,\exp\Bigl(n\log(n) - n + \frac{\log(n)}{2} + \frac{1}{12n}\Bigr).$

For 0 < λ < 1, define the entropy function H(λ) = −λ log(λ) − (1 − λ) log(1 − λ). From (3) we 
have the following upper bounds for the binomial coefficients, valid for all integers n, k ≥ 3 and 
w = 1, 2, ..., n − 1: 

In what follows we require asymptotic approximations for coefficients of large powers of certain 
polynomials. For a polynomial P(z), denote by [z^j]{P(z)} the coefficient of the term z^j in P(z). 
For example, [z^2]{1 + 6z^2 + z^4} = 6 and [z]{1 + 3z^2} = 0. The following theorem is a well-known 
"local limit analogue" [16, 43] of the central limit theorem in probability theory; see [36, Chap. IX]. 

Theorem 5 (Local limit law for coefficients of a polynomial power). Let P(z) be a polynomial of 
degree d ≥ 1 with a positive constant term and positive coefficients such that the greatest common 
divisor of the degrees of the nonzero terms of P(z) is 1, let 

$\mu = \frac{P'(1)}{P(1)}, \qquad \sigma^2 = \frac{P''(1)}{P(1)} + \mu - \mu^2,$

and let 0 < δ < 2/3. Then, for all large enough n, it holds uniformly for all integers of the form 
N = μn + ν with |ν| < n^δ that 

(5)  $[z^N]\{P(z)^n\} = \frac{1}{\sigma\sqrt{2\pi n}}\,P(1)^n \exp\Bigl(-\frac{\nu^2}{2\sigma^2 n}\Bigr)\,(1 + o(1)).$

[[ N.B. A proof of Theorem 5 is provided in Appendix B. ]] 
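
The coefficient notation and the Gaussian approximation can be compared numerically. The sketch below assumes the standard form of the local limit law, with μ = P'(1)/P(1), σ² = P''(1)/P(1) + μ − μ², and estimate P(1)^n exp(−ν²/(2σ²n)) / (σ√(2πn)) for ν = N − μn; the function names are introduced here for illustration.

```python
from math import comb, exp, pi, sqrt

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = degree)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def coeff_of_power(p, n, N):
    """Exact value of [z^N]{P(z)^n} by repeated multiplication."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, p)
    return result[N] if N < len(result) else 0

def local_limit_estimate(p, n, N):
    """Gaussian estimate of [z^N]{P(z)^n} under the assumptions stated above."""
    P1 = sum(p)
    dP1 = sum(i * a for i, a in enumerate(p))
    ddP1 = sum(i * (i - 1) * a for i, a in enumerate(p))
    mu = dP1 / P1
    sigma2 = ddP1 / P1 + mu - mu * mu
    nu = N - mu * n
    return P1 ** n / sqrt(2 * pi * n * sigma2) * exp(-nu * nu / (2 * sigma2 * n))
```

For P(z) = 1 + z the exact coefficient is a binomial coefficient, so the estimate reduces to the familiar Gaussian approximation of the central binomial coefficients.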

2.3. The Configuration Model for k-Regular Matrices. For integers n ≥ k ≥ 3, let X and 
Y be two kn-element sets of points, both of which are partitioned into n cells of k points each. 
A (k, n)-configuration is a bijection γ : X → Y. Denote by $\mathcal{I}$ the set of cells in X and by $\mathcal{J}$ the 
set of cells in Y. Associated with a (k, n)-configuration γ there is an n × n integer matrix A = (a_{IJ}) 
defined for all I ∈ $\mathcal{I}$ and J ∈ $\mathcal{J}$ by a_{IJ} = |{i ∈ I : γ(i) ∈ J}|. We clearly have Σ_J a_{IJ} = k for all 
I ∈ $\mathcal{I}$ and Σ_I a_{IJ} = k for all J ∈ $\mathcal{J}$. A configuration is simple if A is a 0/1 matrix. 

The following theorem is due to Békéssy, Békéssy, and Komlós [14] and O'Neil [79]; early related 
results are due to Erdős and Kaplansky [3lj and Read. 

Theorem 6. A random (k, n)-configuration is simple with probability exp(−(k − 1)^2/2) + o(1). 

It is well known that any given k-regular matrix is obtained from exactly (k!)^{2n} simple 
(k, n)-configurations, enabling one to access the uniform distribution on the set of all k-regular n × n 
matrices via the uniform distribution on the set of all simple (k, n)-configurations. 
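
The configuration model suggests a simple rejection sampler for random k-regular matrices: draw a uniform random bijection between the two point sets and retry until the induced matrix is simple. The sketch below is one way to realize this; the function name, the point numbering, and the `max_tries` cutoff are assumptions made for illustration.

```python
import random

def random_k_regular_matrix(n, k, rng, max_tries=10_000):
    """Rejection-sample a k-regular n x n 0/1 matrix via the configuration model."""
    for _ in range(max_tries):
        # Points of X and of Y are numbered 0..kn-1; point p lies in cell p // k.
        perm = list(range(k * n))
        rng.shuffle(perm)                    # uniform random bijection X -> Y
        A = [[0] * n for _ in range(n)]
        for x, y in enumerate(perm):
            A[x // k][y // k] += 1           # count points of cell I mapped into cell J
        if all(a <= 1 for row in A for a in row):   # the configuration is simple
            return A
    raise RuntimeError("no simple configuration found within max_tries")
```

Because every k-regular matrix arises from the same number of simple configurations, accepting the first simple configuration yields a uniformly distributed k-regular matrix, and by Theorem 6 the expected number of attempts stays bounded as n grows.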

Considerable extensions of Theorem 6 are also known; see [T71 [JSl HGJ [99]. 

3. Expected Size of The Kernel 

We proceed with the proof of Theorem 3. 

Proof. By linearity of expectation, we can express the expected size of the kernel as a sum of 
expectations of 0/1 indicator variables, one indicator for each of the 2^n vectors. The expectation 
of each indicator is equal to the probability of the corresponding vector occurring in the kernel. By 
symmetry, for each weight w = 0, 1, ..., n, all the $\binom{n}{w}$ vectors of weight w have equal probability 
of occurring in the kernel. Denote by P_k(n, w) the probability that a given vector x of weight w 
occurs in the kernel. 

We proceed to derive an upper bound for P_k(n, w) using the configuration model. We have 
that x occurs in the kernel of A if and only if the columns of A corresponding to the w nonzero 
coordinates of x form a submatrix with an even number of nonzero entries in every row. Let 
e_i ∈ {0, 2, ..., 2⌊k/2⌋} be the number of nonzero entries in row i of this submatrix. Because A is 
k-regular, Σ_{i=1}^{n} e_i = kw. The number of simple (k, n)-configurations that induce an A meeting a 
given nonnegative even composition e_1 + e_2 + ... + e_n = kw is at most (kw)! · (k(n − w))! · $\prod_{i=1}^{n}\binom{k}{e_i}$. 
To obtain an upper bound for the total number of simple (k, n)-configurations that induce an A 
with x in the kernel, let 

(6)  $E_k(z) = \sum_{j=0}^{\lfloor k/2 \rfloor} \binom{k}{2j} z^{2j}, \qquad B_k(n,w) = [z^{kw}]\{E_k(z)^n\}.$

Now observe that the total number of simple (k, n)-configurations that induce an A with x in the 
kernel is at most (kw)! · (k(n − w))! · B_k(n, w), where B_k(n, w) in effect sums the product $\prod_{i=1}^{n}\binom{k}{e_i}$ 
over all the eligible compositions e_1 + e_2 + ... + e_n = kw. By Theorem 6, for all large enough n 
there are at least ρ · (kn)! simple (k, n)-configurations, where ρ is any positive constant less than 
exp(−(k − 1)^2/2). We thus have the upper bound $P_k(n,w) \le \rho^{-1}\binom{kn}{kw}^{-1} B_k(n,w)$. 

Taking the sum of P_k(n, w) over all vectors of weight w and all weights w = 0, 1, ..., n, we have 
that the expected size of the kernel of a random k-regular matrix of size n × n is at most ρ^{−1} S_k(n), 
where 

(7)  $S_k(n) = \sum_{w=0}^{n} \binom{n}{w}\binom{kn}{kw}^{-1} B_k(n,w).$

The rest of this section provides an asymptotic analysis establishing that S_k(n) = O(1). □ 

Theorem 7. S_k(n) = 2 + o(1) if k is odd and S_k(n) = 4 + o(1) if k is even. 
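
The sum S_k(n) can be evaluated exactly for moderate n, which gives a numerical sanity check on the limits stated in Theorem 7. The sketch below assumes the reading S_k(n) = Σ_w C(n, w) B_k(n, w) / C(kn, kw), with B_k(n, w) the coefficient of z^{kw} in E_k(z)^n and E_k(z) the even part of (1 + z)^k; the function names are introduced here.

```python
from fractions import Fraction
from math import comb

def even_poly(k):
    """Coefficient list of E_k(z) = sum over even e of C(k, e) z^e."""
    return [comb(k, e) if e % 2 == 0 else 0 for e in range(k + 1)]

def poly_power(p, n):
    """Coefficient list of P(z)^n by repeated multiplication."""
    result = [1]
    for _ in range(n):
        new = [0] * (len(result) + len(p) - 1)
        for i, a in enumerate(result):
            if a:
                for j, b in enumerate(p):
                    new[i + j] += a * b
        result = new
    return result

def S(k, n):
    """S_k(n) as an exact Fraction, under the reading stated above."""
    En = poly_power(even_poly(k), n)          # En[N] = [z^N]{E_k(z)^n}
    return sum(Fraction(comb(n, w) * En[k * w], comb(k * n, k * w))
               for w in range(n + 1))
```

Evaluating `float(S(k, n))` for increasing n and fixed k gives a quick way to observe the behavior asserted by the theorem; the computation is polynomial in n but uses large integers.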

Proof. Partition the sum (7) into the following intervals: 

      0 ≤ w < n/(2k),                            (left extreme deviation) 
      n/(2k) ≤ w < (n − n^{3/5})/2,              (left large deviation) 
(8)   (n − n^{3/5})/2 ≤ w ≤ (n + n^{3/5})/2,     (central region) 
      (n + n^{3/5})/2 < w ≤ n(1 − 1/(2k)),       (right large deviation) 
      n(1 − 1/(2k)) < w ≤ n.                     (right extreme deviation) 

Observe that B_k(n, w) = 0 if kw is odd. Furthermore, if k is even, we have B_k(n, w) = B_k(n, n − w) 
by symmetry of the binomial coefficients, implying that left and right regions are identical if k is 
even. If k is odd, then E_k has degree k − 1, implying that B_k(n, w) = 0 for all w > (k − 1)n/k and 
that the sum is zero in the right extreme region. 

Claim 8. The sum in the central region is 1 + o(1) if k is odd and 2 + o(1) if k is even. 

Proof. Using Theorem 5, we first derive Gaussian approximations to the terms $\binom{n}{w}$, $\binom{kn}{kw}$, and 
B_k(n, w) in the central region. To this end, let δ = 3/5. From the binomial theorem it follows that 
$\binom{\alpha n}{\alpha w} = [z^{\alpha w}]\{((1+z)^{\alpha})^{n}\}$ for nonnegative integers α, n, w. Setting P(z) = (1 + z)^α, we have μ = α/2 
and σ = √α/2 in Theorem 5. We obtain that 

(9)  $\binom{\alpha n}{\alpha w} = \frac{2^{\alpha n+1}}{\sqrt{2\pi n\alpha}} \exp\Bigl(-\frac{\alpha(2w-n)^2}{2n}\Bigr)(1 + o(1))$

uniformly for all integers w in the central region (n − n^{3/5})/2 ≤ w ≤ (n + n^{3/5})/2. To approximate 
B_k(n, w), let P(z) = E_k(√z) and observe that P(z) is a polynomial meeting the requirements of 
Theorem 5 with μ = k/4 and σ = √k/4. We obtain 

(10)  $B_k(n,w) = \begin{cases} \dfrac{2^{(k-1)n+2}}{\sqrt{2\pi n k}} \exp\Bigl(-\dfrac{k(2w-n)^2}{2n}\Bigr)(1 + o(1)) & \text{if } kw \text{ is even}, \\ 0 & \text{if } kw \text{ is odd}, \end{cases}$

uniformly for all integers w in the central region. From (9) and (10) we have 

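$\binom{n}{w}\binom{kn}{kw}^{-1} B_k(n,w) = \begin{cases} 2\sqrt{\dfrac{2}{\pi n}} \exp\Bigl(-\dfrac{(2w-n)^2}{2n}\Bigr)(1 + o(1)) & \text{if } kw \text{ is even}, \\ 0 & \text{if } kw \text{ is odd}. \end{cases}$

Thus, for w in the central region 

$\sum_{w} \binom{n}{w}\binom{kn}{kw}^{-1} B_k(n,w) = 2\sqrt{\frac{2}{\pi n}}\,(1 + o(1)) \sum_{w:\ kw\ \mathrm{even}} \exp\Bigl(-\frac{(2w-n)^2}{2n}\Bigr) = 2\sqrt{\frac{2}{\pi n}}\,(1 + o(1)) \sum_{t} \exp\Bigl(-\frac{2t^2}{n}\Bigr)$

$\to \begin{cases} 2\sqrt{2/\pi}\int_{-\infty}^{\infty} \exp(-2s^2)\,ds = 2 & \text{if } k \text{ is even}, \\ \sqrt{2/\pi}\int_{-\infty}^{\infty} \exp(-2s^2)\,ds = 1 & \text{if } k \text{ is odd}, \end{cases}$

where the second equality follows from the change of variables t = w − n/2, and the limit as n → ∞ 
follows from the observation that for w in the central region, t/√n ranges over 
−n^{1/10}/2 ≤ t/√n ≤ n^{1/10}/2. The halving when k is odd is due to the terms associated with odd w 
being zero if k is odd. □ 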

Claim 9. The sum in the left and right large deviation regions is o(1). 

Proof. First we use an approximate variant of the saddle point method (see e.g. [36, Chap. VIII]) 
to derive an upper bound for B_k(n, w). By Cauchy's coefficient formula, 

$B_k(n,w) = \frac{1}{2\pi i} \oint \frac{E_k(z)^n}{z^{kw+1}}\,dz,$

where the integration contour can be taken to be a positively oriented circle of radius ξ > 0 centered 
at the origin of the complex plane. Because E_k(z) is a polynomial with positive coefficients, the 
integrand assumes its maximum modulus on the contour at z = ξ. Consequently, letting λ = w/n, 

(11)  $B_k(n,w) \le \frac{1}{2\pi} \oint \frac{E_k(\xi)^n}{\xi^{kw+1}}\,|dz| = \frac{E_k(\xi)^n}{\xi^{kw}} = \Bigl(\frac{E_k(\xi)}{\xi^{\lambda k}}\Bigr)^{n}.$

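As an approximation to a saddle point contour, let ξ = (λ/(1 − λ))^{(k−1)/k} and observe that 

(12)  $\frac{\exp(-(k-1)H(\lambda))}{\xi^{\lambda k}} = \frac{1}{(1 + \xi^{k/(k-1)})^{k-1}}.$

Combining (11) and (12), we have 

(13)  $\binom{n}{w}\binom{kn}{kw}^{-1} B_k(n,w) = O(1) \cdot \Bigl(\frac{E_k(\xi)}{(1 + \xi^{k/(k-1)})^{k-1}}\Bigr)^{n}.$

Let τ^{k−1} = ξ. 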

Lemma 10. For all τ > 0 it holds that E_k(τ^{k−1}) ≤ (1 + τ^k)^{k−1}, with equality if and only if τ = 1. 

Proof. Recalling (6) and using the binomial theorem, the inequality E_k(τ^{k−1}) ≤ (1 + τ^k)^{k−1} is 
easily seen to be equivalent to 

In what follows we assume that e is an even nonnegative integer; in particular, if e is used as the 
index of summation, then it is assumed that e runs over all even nonnegative integers. Recalling 
that $\binom{k}{j} = \binom{k-1}{j} + \binom{k-1}{j-1}$ for nonnegative integers k and j, it is straightforward to check that (14) 
is equivalent to 



The e = 0 terms cancel in (15), so we may assume e > 0. To establish (15), we show that 

e 



(16) 



holds for each e > 0, with equality if and only if τ = 1. To this end, divide both sides of (16) by 
(17)  f(τ) = (k − e)τ^k − kτ^{k−e} + e ≥ 0. 

Now observe that for e > 0 we have f(0) = e > 0, f(1) = 0, and f(∞) = ∞. Taking the derivative 
of f, the real zeroes of f'(τ) = k(k − e)τ^{k−e−1}(τ^e − 1) are −1, 0, and 1. Thus, for τ > 0 we have 
f(τ) ≥ 0, with equality if and only if τ = 1. □ 
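
The inequality of Lemma 10 is easy to probe numerically. The sketch below assumes the closed form E_k(z) = ((1 + z)^k + (1 − z)^k)/2, which follows from the binomial theorem applied to the even part of (1 + z)^k; the function names are introduced here for illustration.

```python
def E(k, z):
    """E_k(z) = ((1 + z)^k + (1 - z)^k) / 2, the even part of (1 + z)^k."""
    return ((1 + z) ** k + (1 - z) ** k) / 2

def lemma10_gap(k, tau):
    """(1 + tau^k)^(k-1) - E_k(tau^(k-1)); Lemma 10 asserts this is >= 0."""
    return (1 + tau ** k) ** (k - 1) - E(k, tau ** (k - 1))
```

The gap vanishes at τ = 1 and is positive at the other sample points tried in the accompanying check, in line with the lemma; such a check is of course no substitute for the proof.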

We now continue the proof of Claim 9. Observe that λ ∈ (0, 1) implies ξ ∈ (0, ∞), with 
λ = 1/2 if and only if ξ = τ = 1. Thus, for w in the large deviation regions, that is, for λ = w/n 
with n^{−2/5}/2 ≤ |λ − 1/2| ≤ (k − 1)/(2k), we have E_k(ξ)/(1 + ξ^{k/(k−1)})^{k−1} < 1 in (13) by Lemma 
10. Developing E_k(ξ)/(1 + ξ^{k/(k−1)})^{k−1} into a truncated Taylor series at λ = 1/2 and evaluating 
at λ = (1 ± n^{−2/5})/2, we obtain 



^1 + ^fe/{fc-l)) 



uniformly for all w in the left and right large deviation regions. The claim now follows from (13) 
and (18) because the regions have O(n) summands. □ 

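Claim 11. The sum in the left extreme deviation region is 1 + o(1). 

Proof. For w = 0 the term in the sum is 1. For w = 1, 2 the terms are O(n^w n^{−kw} n^{kw/2}). Thus, in 
what follows we may restrict to 3 ≤ w < n/(2k). Observe that 

(19)  $B_k(n,w) \le \binom{kw/2 + n - 1}{n - 1}\binom{k}{2}^{kw/2}.$

Indeed, $\binom{kw/2+n-1}{n-1}$ counts the number of integer compositions of kw into n even nonnegative parts, 
and $\binom{k}{2}^{kw/2}$ provides an upper bound for the product $\prod_{i=1}^{n}\binom{k}{e_i}$ associated with each composition 
e_1 + e_2 + ... + e_n = kw into even nonnegative parts at most k. 
Observe by (4) and (19) that 

$\log\Bigl(\binom{n}{w}\binom{kn}{kw}^{-1} B_k(n,w)\Bigr) \le G_k(n,w) + O(1),$

where 

$G_k(n,w) = -(k-1)\,n\,H(w/n) + \Bigl(\frac{kw}{2} + n - 1\Bigr)\log\Bigl(\frac{kw}{2} + n - 1\Bigr) - \frac{kw}{2}\log\Bigl(\frac{kw}{2}\Bigr) - (n-1)\log(n-1) + \frac{kw}{2}\log\binom{k}{2}.$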

Differentiating twice with respect to w, we have that 

$G_k''(n,w) = \frac{k(kn-1)w + n(n-1)(k-2)}{(n-w)\,w\,(kw+2n-2)}$

is positive for 0 < w < n, implying that G_k is convex in the extreme deviation region. In particular, 
G_k assumes its maximum at the boundaries of the region. Evaluating G_k at w = 3 and w = n/(2k), 
we find G_k(n, w) ≤ −(3/2) log(n) + O(1) uniformly for all w in the region. The claim follows because 
the number of terms in the region is O(n). □ 

Combining the results for all the regions, we have that S_k(n) = 2 + o(1) if k is odd and S_k(n) = 
4 + o(1) if k is even. This completes the proof of Theorem 7. □ 

4. Topographical Properties 

Throughout this section we consider the landscape associated with a system Ax = 0 (mod 2), 
where A is a k-regular matrix of size n × n that both (a) has a kernel of size at most 2^d and (b) is 
a (k, βn, k − 2 − δ)-boundary expander, where 0 < β < 1/2, d > 0, and 0 < δ < 1/3 are constants 
independent of n. 

4.1. Energy Barriers and Local Minima. The intuition underlying Theorem 1 is as follows. 
The boundary expansion property in effect "surrounds" a ground state with a "perimeter" of radius 
⌊βn⌋ in the n-dimensional hypercube, where the energy ("wall") at every perimeter state is at least 
(k − 2 − δ)⌊βn⌋, so any state outside the perimeter with considerably lower energy has a considerable 
barrier separating it from the ground state. 

Let us now make this intuition formally precise and prove Theorem 1. 

Property (i) is immediate by assumption. To establish property (ii), let g_1 and g_2 be any two 
distinct ground states. Clearly, D(g_1, g_2) = W(g_1 + g_2) > 0 and Ag_1 = Ag_2 = 0. Thus, it follows 
from the boundary expansion property that D(g_1, g_2) = W(g_1 + g_2) > βn. (Indeed, otherwise we 
would have the contradiction 0 = W(A(g_1 + g_2)) ≥ (k − 2 − δ)W(g_1 + g_2) > 0.) Thus, any walk of 
successive adjacent states from g_1 to g_2 must have a "perimeter" state p with 
D(g_1, p) = W(g_1 + p) = ⌊βn⌋. By the boundary expansion property, 

E(p) − E(g_1) = W(Ap) − W(Ag_1) = W(Ap) = W(A(g_1 + p)) ≥ (k − 2 − δ)W(g_1 + p) = (k − 2 − δ)⌊βn⌋. 

Since g_1 and g_2 were arbitrary, we have thus established that distinct ground states are at distance 
Ω(n) and separated by an Ω(n) energy barrier. 

To establish property (iii), we first require a large enough set of local minima. With foresight,
select any constant $\gamma$ such that

\[ 0 < \gamma < \min\left\{ \frac{\beta(k-2-\delta)}{4},\ \frac{1}{2^d(k(k-1)+1)} \right\}. \]

Because the kernel of $A$ has dimension at most $d$, by elementary linear algebra there is a linearly
independent set of $n-d$ columns of $A$. Furthermore, $A$ restricted to these columns has a linearly
independent set of $n-d$ rows. By permuting the rows if necessary, we can assume that these rows
occur first in $A$. Applying Gaussian elimination to the selected $n-d$ linearly independent columns,
we find $n-d$ vectors $y_1, y_2, \ldots, y_{n-d}$ with the property that $Ay_j = e_j + r_j$, where $r_j$ is a vector
with the first $n-d$ entries equal to 0, and $e_j$ is the $j$th vector in the standard basis. Observe that
the vectors $y_1, y_2, \ldots, y_{n-d}$ are linearly independent.

We say that $y_j$ marks the rows that contain a 1 in $A$ in at least one of the columns containing
a 1 in row $j$. In other words, denoting by $a_{pq}$ the entry of $A$ at row $p$, column $q$, we have that $y_j$
marks the rows $\{i : \exists q \ a_{iq} = a_{jq} = 1\}$. Observe that because $A$ is $k$-regular, each vector $y_j$ marks
at most $k(k-1)+1$ rows.
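The marking rule and the bound $k(k-1)+1$ can be checked directly on a small example. This is an illustrative sketch only: the 7 x 7 circulant matrix below (row $j$ has ones in columns $j$, $j+1$, $j+3$ modulo 7) is an assumed toy instance, and `marked_rows` is a hypothetical helper name.

```python
# Assumed toy instance: a 3-regular circulant 0/1 matrix given by its rows,
# each row listed as the column indices that contain a 1.
K = 3
ROWS = [[j % 7, (j + 1) % 7, (j + 3) % 7] for j in range(7)]

def marked_rows(rows, j):
    """Rows i that share at least one column containing a 1 with row j."""
    cols_j = set(rows[j])
    return {i for i, row in enumerate(rows) if cols_j & set(row)}

if __name__ == "__main__":
    for j in range(len(ROWS)):
        marks = marked_rows(ROWS, j)
        # Row j always marks itself, and at most k(k-1) further rows.
        assert j in marks and len(marks) <= K * (K - 1) + 1
    print(sorted(marked_rows(ROWS, 0)))
```

In this particular instance the bound is tight: any two rows share a column, so each row marks all $7 = k(k-1)+1$ rows.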

There are at most $2^d$ different vectors $r_j$. Thus, because $d$ is a fixed constant independent of $n$,
there exist at least $(n-d)/2^d$ vectors $y_j$ that have identical associated vectors $r_j$. Among these
vectors, start selecting vectors one by one and marking associated rows subject to the constraint
that no row is marked more than once, until no more vectors can be selected. Let $m$ be the number
of vectors selected in this way. Clearly, $(n-d)/(2^d(k(k-1)+1)) \le m \le n$. By re-indexing the
vectors and permuting the rows and columns of $A$ if necessary, we can assume that the selected
vectors are $y_1, y_2, \ldots, y_m$.

We now claim that every state $u$ of the form $u = \sum_{j=1}^{m} x_j y_j$ with $x_j \in \{0,1\}$ and $\sum_{j=1}^{m} x_j = 0
\pmod 2$ is a local minimum. To see this, observe first that $Au = \sum_{j=1}^{m} x_j e_j$ and $E(u) =
\sum_{j=1}^{m} x_j$. Now the marking constraint implies that if we flip the value of any one variable in $u$, we
satisfy at most one violated equation and introduce at least $k-1$ new violated equations. Thus, $u$
is a local minimum.
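The local-minimum condition itself is easy to test mechanically. The sketch below assumes the convention that a state is a local minimum when no single variable flip strictly lowers the energy; the toy 7 x 7 circulant matrix and the names `energy`, `is_local_minimum`, and `local_minima` are illustrative assumptions, not the construction used in the proof.

```python
from itertools import product

# Assumed toy instance: row j has ones in columns j, j+1, j+3 (mod 7).
ROWS = [[j % 7, (j + 1) % 7, (j + 3) % 7] for j in range(7)]

def energy(rows, x):
    """E(x) = W(Ax mod 2)."""
    return sum(sum(x[q] for q in row) % 2 for row in rows)

def is_local_minimum(rows, x):
    """True if no single-variable flip strictly decreases the energy."""
    base = energy(rows, x)
    for q in range(len(x)):
        y = list(x)
        y[q] ^= 1
        if energy(rows, y) < base:
            return False
    return True

def local_minima(rows, n):
    """Enumerate all local minima by exhaustive search (feasible only for tiny n)."""
    return [list(x) for x in product((0, 1), repeat=n) if is_local_minimum(rows, x)]

if __name__ == "__main__":
    minima = local_minima(ROWS, 7)
    print(len(minima), sorted({energy(ROWS, x) for x in minima}))
```

Every ground state is trivially a local minimum under this convention; the interesting states are the local minima with positive energy, which exhaustive search can only find for very small $n$.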
We proceed to construct an auxiliary graph that we eventually use to establish the energy barriers
separating certain local minima from all the ground states. For $j = 1, 2, \ldots, m-1$, let $z_j = y_j + y_m$.
Observe that the vectors $z_1, z_2, \ldots, z_{m-1}$ are linearly independent. Furthermore, the $\binom{m-1}{2}$ sums
of the form $z_i + z_j$ with $1 \le i < j \le m-1$ clearly satisfy $W(A(z_i+z_j)) = 2$.

Because $A$ is $k$-regular, there are at most $\binom{k}{2}n = O(n) = O(m)$ vectors $y$ with $W(Ay) = 2$ and
$W(y) \le 2/(k-2-\delta) < 3$. To see this, observe that the two columns selected by any vector $y$ with
$W(y) = 2$ and $W(Ay) = 2$ must have at least one row containing a 1 in both columns, and the
total number of such "11"-patterns in $A$ is $\binom{k}{2}n$. Thus, by the expansion property, $\binom{m-1}{2} - O(m)$
of the sums $z_i + z_j$ satisfy $W(z_i+z_j) > \beta n$.

Form an auxiliary graph with the vertex set $\{z_1, z_2, \ldots, z_{m-1}\}$ such that any two distinct vertices,
$z_i$ and $z_j$, are adjacent if and only if $W(z_i+z_j) \le \beta n$. Because the number of edges in the auxiliary
graph is $O(m)$, for all sufficiently large $n$ the auxiliary graph has an independent set of size $2^d+1$,
which, by relabeling if necessary, can be assumed to consist of the vectors $z_1, z_2, \ldots, z_{2^d+1}$.
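The step from "$O(m)$ edges" to "a large independent set" can be made concrete with a greedy procedure. The sketch below is one standard way to do it, not necessarily the argument the author had in mind: repeatedly picking a vertex of minimum remaining degree and deleting its neighbours is known to yield an independent set of size at least $m/(\bar{d}+1)$, where $\bar{d}$ is the average degree. The function name and the example graph are hypothetical.

```python
def greedy_independent_set(num_vertices, edges):
    """Min-degree greedy: pick a vertex of least remaining degree, drop its neighbours."""
    adj = {v: set() for v in range(num_vertices)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    alive = set(range(num_vertices))
    chosen = []
    while alive:
        # Ties are broken by vertex index so that the result is deterministic.
        v = min(alive, key=lambda u: (len(adj[u] & alive), u))
        chosen.append(v)
        alive -= adj[v] | {v}
    return chosen

if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(greedy_independent_set(5, path))
```

With $O(m)$ edges the average degree is $O(1)$, so the independent set found has size $\Omega(m)$, which exceeds the constant $2^d+1$ for all sufficiently large $n$.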

We now construct the local minima meeting property (iii). Let $g_1, g_2, \ldots, g_s$ be the ground
states, $s \le 2^d$. Observe that any sum consisting of a subset of the linearly independent vectors
$z_1, z_2, \ldots, z_{m-1}$ is a local minimum. Furthermore, the energy of such a minimum is at most the
number of summands plus one. Select any $\lceil \gamma n \rceil$ of the vectors $z_{2^d+2}, z_{2^d+3}, \ldots, z_{m-1}$. (Note that
for all sufficiently large $n$ this is possible due to the choice of $\gamma$.)

Consider now any state $u$ formed as the sum of a nonempty subset of the $\lceil \gamma n \rceil$ selected vectors.
The energy of $u$ is $E(u) = W(Au) \le \lceil \gamma n \rceil + 1$. The state $u$ does not necessarily have extensive
barriers separating it from each of the ground states. However, if the following condition holds,
then $u$ is separated by extensive barriers from the ground states. If the condition does not hold,
then adding one of the (independent) vectors $z_1, z_2, \ldots, z_{2^d+1}$ to $u$ will produce a local minimum
that is separated by extensive barriers from the ground states.

Suppose that $D(u,g_j) \ge \beta n/2$ holds for all $j = 1, 2, \ldots, s$. Thus, for every solution state $g_j$, any
walk from $u$ to $g_j$ consisting of successive adjacent states must contain a "perimeter" state $p$ at
distance $D(p,g_j) = W(p+g_j) = \lceil \beta n/2 \rceil$. By the boundary expansion property, the energy of $p$ is

\[ E(p) = W(Ap) = W(Ap + Ag_j) = W(A(p+g_j)) \ge (k-2-\delta)W(p+g_j) \ge \frac{(k-2-\delta)\beta n}{2}. \]

In particular, the increase in energy at $p$ compared with the energy of $u$ is

\[ E(p) - E(u) = W(Ap) - W(Au) \ge \frac{(k-2-\delta)\beta n}{2} - \lceil \gamma n \rceil - 1 > \gamma n. \]

Thus, the energy barrier separating $u$ from $g_j$ is at least $\gamma n$, assuming that $D(u,g_j) \ge \beta n/2$ holds
for all $j = 1, 2, \ldots, s$.

Suppose that $D(u,g_j) < \beta n/2$ holds for at least one $j = 1, 2, \ldots, s$. Then, we claim that there
exists at least one $\ell = 1, 2, \ldots, 2^d+1$ such that $D(u+z_\ell, g_j) \ge \beta n/2$ holds for all $j = 1, 2, \ldots, s$.
To reach a contradiction, suppose that this is not the case. Then, by the pigeonhole principle,
there exist a $j = 1, 2, \ldots, s$ and $1 \le \ell_1 < \ell_2 \le 2^d+1$ such that $D(u+z_{\ell_1}, g_j) < \beta n/2$ and
$D(u+z_{\ell_2}, g_j) < \beta n/2$. By the triangle inequality, $D(u+z_{\ell_1}, u+z_{\ell_2}) < \beta n$, which by $D(z_{\ell_1}, z_{\ell_2}) =
D(u+z_{\ell_1}, u+z_{\ell_2})$ contradicts the fact that $z_1, z_2, \ldots, z_{2^d+1}$ form an independent set in the auxiliary
graph. Therefore, there exists at least one $\ell = 1, 2, \ldots, 2^d+1$ such that $D(u+z_\ell, g_j) \ge \beta n/2$ holds
for all $j = 1, 2, \ldots, s$. Applying the argument in the previous paragraph to the state $u+z_\ell$, we
have that $u+z_\ell$ is separated from every solution state by an energy barrier of at least $\gamma n$.

Because the vectors $z_1, z_2, \ldots, z_{m-1}$ are linearly independent, we have thus established the
existence of at least $2^{\gamma n} - 1$ distinct local minima, each separated from every ground state by an energy
barrier of at least $\gamma n$. This establishes property (iii).

4.2. Lower Bound for the Focused Random Walk. The intuition underlying Theorem 2 is
as follows. Consider any ground state $g$. In any state $s \ne g$ with $D(s,g) \le \lfloor \beta n \rfloor$, the boundary
expansion property implies that most equations violated by $s$ in the system $Ax = 0 \pmod 2$ have
exactly one variable that assumes different values in $s$ and $g$. In particular, the focused random
walk is unlikely to flip this variable (there are $k-1$ other choices), thereby exerting a systematic
drift away from $g$. Thus, expansion in effect induces a region of entropic repulsion around every
ground state.
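For readers who want to experiment, the focused random walk on a parity system can be sketched in a few lines. This is a minimal illustration under stated assumptions: each step picks a violated equation uniformly at random and flips a uniformly random variable of that equation; the system is given as a list of rows (lists of variable indices), and the function names are hypothetical.

```python
import random

def violated_equations(rows, x):
    """Indices of the parity equations that x violates."""
    return [i for i, row in enumerate(rows) if sum(x[q] for q in row) % 2 == 1]

def focused_random_walk(rows, start, rng, max_steps):
    """Run the walk; return (steps, state) on success or (None, state) on timeout."""
    x = list(start)
    for step in range(max_steps):
        bad = violated_equations(rows, x)
        if not bad:
            return step, x
        row = rows[rng.choice(bad)]   # focus on a uniformly random violated equation
        x[rng.choice(row)] ^= 1       # flip a uniformly random variable in it
    if not violated_equations(rows, x):
        return max_steps, x
    return None, x

if __name__ == "__main__":
    # Trivial assumed example: two independent single-variable equations.
    print(focused_random_walk([[0], [1]], [1, 1], random.Random(0), 10))
```

The walk only ever touches variables occurring in violated equations, which is what makes the drift argument of this section applicable.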

Let us now make this intuition formally precise and prove Theorem 2. As was demonstrated in
§4.1, all ground states in the landscape have pairwise distance at least $\lfloor \beta n \rfloor$. Because $\beta < 1/2$, it follows by
standard tail bounds for the binomial distribution (see e.g. [17, §1]) that with probability $1 - 2^{-\Omega(n)}$
the random initial state for the focused random walk has distance at least $\lfloor \beta n \rfloor$ to each of the at
most $2^d$ ground states. It thus suffices to show that the expected number of steps to reach the
ground state from the perimeter of a region of repulsion is $2^{\Omega(n)}$. To this end, let $g$ be any ground
state, and consider any state $s \ne g$ with $D(s,g) = W(s+g) \le \lfloor \beta n \rfloor$. Because $Ag = 0$, the
number of violated equations in $s$ is $E(s) = W(As) = W(A(s+g)) = E(s+g)$. By $k$-regularity,
$E(s+g) \le kW(s+g)$. By the expansion property and $As = A(s+g)$, at least $(k-2-\delta)W(s+g)$
equations violated by $s$ contain exactly one variable having a different value in $s$ and $g$. Thus,
assuming that the focused random walk is in the state $s$, the random step will increase the distance
to $g$ by 1 with probability at least

\[ \frac{(k-1)(k-2-\delta)W(s+g)}{kE(s)} \ge \frac{(k-1)(k-2-\delta)W(s+g)}{k^2 W(s+g)} = \frac{(k-1)(k-2-\delta)}{k^2}; \]

otherwise the distance to $g$ decreases by 1. For $k \ge 6$ this probability is at least $55/108$,
thereby establishing a systematic drift away from $g$ for every state $s \ne g$ with $D(s,g) \le \lfloor \beta n \rfloor$. A
standard analysis of the gambler's ruin problem (see e.g. [35, Chap. XIV]) with one absorbing ruin
state and one reflecting barrier now establishes that, starting from a (reflecting) state $s$ at distance
$D(s,g) \ge \beta n$ from each ground state $g$, the expected number of steps required to reach a ground
(ruin) state is $2^{\Omega(n)}$.
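The exponential growth in the gambler's ruin step can be checked numerically. The sketch below models the distance to $g$ as a birth-death chain on $\{0, 1, \ldots, L\}$ that moves away from 0 with probability $p$ and toward 0 with probability $1-p$; as a simplifying assumption (not taken from the text), an "away" step at the reflecting barrier $L$ leaves the walk in place. Writing $h_i$ for the expected time to move from $i$ to $i-1$, one gets $h_L = 1/(1-p)$ and $h_i = (1 + p\,h_{i+1})/(1-p)$, and the expected time from $L$ to 0 is the sum of the $h_i$. The function name is hypothetical.

```python
from fractions import Fraction

def expected_steps_to_ruin(p_away, barrier):
    """Exact expected number of steps from the reflecting barrier down to state 0."""
    q = 1 - p_away
    h = 1 / q            # at the barrier: h = 1 + p_away * h
    total = h
    for _ in range(barrier - 1):
        h = (1 + p_away * h) / q   # h_i = 1 + p_away * (h_{i+1} + h_i)
        total += h
    return total

if __name__ == "__main__":
    drift = Fraction(55, 108)      # the drift bound quoted for k >= 6
    for barrier in (10, 20, 40, 80):
        print(barrier, float(expected_steps_to_ruin(drift, barrier)))
```

For the unbiased walk the result is $L(L+1)$, whereas for any fixed $p > 1/2$ the terms $h_i$ grow geometrically with ratio $p/(1-p)$, which gives the $2^{\Omega(n)}$ behaviour when $L = \Theta(n)$.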
Appendix



This appendix is provided only for convenience of verification of the earlier results. In particular,
we stress that Theorems 4 and 5 are well known; cf. [53, Theorem 4.16(2)] and [50], [36, Chaps. VIII
and IX].



Appendix A. Proof of Theorem 4

A matrix $A$ is a $(k, \omega, \eta)$-expander if (a) the number of nonzero entries in every column is at
most $k$, and (b) for all $w = 1, 2, \ldots, \lfloor \omega \rfloor$, every submatrix consisting of $w$ columns of $A$ has at least
$\lceil \eta w \rceil$ rows containing at least one nonzero value.

Theorem 4 follows immediately by combining the following two results.

Lemma 12. Let $A$ be a $(k, \omega, \eta)$-expander. Then, $A$ is a $(k, \omega, 2\eta - k)$-boundary expander.

Proof. Consider any submatrix of $A$ consisting of $w$ of its columns, $1 \le w \le \omega$. Denote by $c_i$
the number of rows with exactly $i$ nonzero values in these columns. Because $A$ is an expander,
we have $C = c_1 + c_2 + \cdots + c_w \ge \eta w$ and $C' = c_1 + 2c_2 + \cdots + wc_w \le kw$. In particular,
$c_1 \ge 2C - C' \ge (2\eta - k)w$. □
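The counting inequality $c_1 \ge 2C - C'$ in the proof holds for every column subset of every 0/1 matrix, which makes it easy to sanity-check by brute force. The following sketch uses an assumed toy matrix (the 7 x 7 circulant with ones in columns $j$, $j+1$, $j+3$ modulo 7 of row $j$) and a hypothetical helper `column_subset_counts`.

```python
from itertools import combinations

# Assumed toy instance, given by rows as lists of column indices holding a 1.
ROWS = [[j % 7, (j + 1) % 7, (j + 3) % 7] for j in range(7)]

def column_subset_counts(rows, cols):
    """Return (c1, C, C') for the submatrix formed by the columns in `cols`."""
    cols = set(cols)
    hits = [len(cols & set(row)) for row in rows]   # nonzeros per row in the submatrix
    c1 = sum(h == 1 for h in hits)                   # boundary rows: exactly one nonzero
    C = sum(h >= 1 for h in hits)                    # rows with at least one nonzero
    C_prime = sum(hits)                              # total number of nonzeros
    return c1, C, C_prime

if __name__ == "__main__":
    for w in range(1, 8):
        for cols in combinations(range(7), w):
            c1, C, C_prime = column_subset_counts(ROWS, cols)
            assert c1 >= 2 * C - C_prime
    print(column_subset_counts(ROWS, (0, 1)))
```

The inequality follows because a row with $i \ge 2$ nonzeros contributes $2 - i \le 0$ to $2C - C'$, while a row with exactly one nonzero contributes 1.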

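Theorem 13. For every $\delta > 0$ there exists a $\beta > 0$ such that a random $k$-regular matrix is a
$(k, \beta n, k-1-\delta)$-expander with probability $1 - o(1)$.

Proof. Select any $0 < \delta < 1$. Let $\eta = k-1-\delta$ and

\[ U_k(n,w) = \binom{n}{w}\binom{n}{\lfloor \eta w \rfloor} \frac{(k\lfloor \eta w \rfloor)!\,(kn-kw)!}{(k\lfloor \eta w \rfloor - kw)!\,(kn)!} = \binom{n}{w}\binom{n}{\lfloor \eta w \rfloor}\binom{k\lfloor \eta w \rfloor}{kw}\binom{kn}{kw}^{-1}. \tag{21} \]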

Recalling the configuration model from §2.3, we claim that the probability that a random $(k,n)$-configuration
does not define a $(k, \beta n, \eta)$-expander is bounded from above by $\sum_{w=1}^{\lfloor \beta n \rfloor} U_k(n,w)$. To
see this, observe that for every configuration violating the expansion property, there exists a set
of $w$ cells in $J$ and a set of $\lfloor \eta w \rfloor$ cells in $I$ such that the $kw$ points in the former set of cells are
paired with points in the latter. There are $(k\lfloor \eta w \rfloor)! / (k\lfloor \eta w \rfloor - kw)!$ ways to pair the $kw$ points, and
$(kn-kw)!$ ways to pair the remaining points.
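The two expressions for $U_k(n,w)$ in (21), the factorial form and the product of binomial coefficients, can be compared exactly with rational arithmetic. This is a small verification sketch; the function names and the sample parameters ($k = 3$, $\delta = 1/4$, $n = 30$) are arbitrary choices made for illustration.

```python
from fractions import Fraction
from math import comb, factorial, floor

def u_factorial_form(k, n, w, eta):
    """U_k(n, w) written with factorials, as in the first expression of (21)."""
    m = floor(eta * w)
    numerator = comb(n, w) * comb(n, m) * factorial(k * m) * factorial(k * n - k * w)
    denominator = factorial(k * m - k * w) * factorial(k * n)
    return Fraction(numerator, denominator)

def u_binomial_form(k, n, w, eta):
    """U_k(n, w) written as a product of binomial coefficients."""
    m = floor(eta * w)
    return Fraction(comb(n, w) * comb(n, m) * comb(k * m, k * w), comb(k * n, k * w))

if __name__ == "__main__":
    k, n = 3, 30
    eta = Fraction(k - 1) - Fraction(1, 4)   # eta = k - 1 - delta with delta = 1/4
    for w in range(1, 9):
        assert u_factorial_form(k, n, w, eta) == u_binomial_form(k, n, w, eta)
        print(w, float(u_binomial_form(k, n, w, eta)))
```

Exact fractions avoid any floating-point doubt about the identity; floats are used only for printing.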

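We proceed to show that $\sum_{w=1}^{\lfloor \beta n \rfloor} U_k(n,w) = o(1)$ for an appropriate fixed $\beta > 0$. For $w =
1, 2, \ldots, \lfloor 2/\delta \rfloor$ we may view $U_k(n,w)$ as a ratio of two polynomials in $n$. The denominator
polynomial has degree $kw$ and the numerator polynomial has degree $w + \lfloor \eta w \rfloor = \lfloor (k-\delta)w \rfloor \le
kw - 1$. Thus, $\sum_{w=1}^{\lfloor 2/\delta \rfloor} U_k(n,w) = O(1/n)$. Now let

\[ L_k(n,w) = nH\!\left(\frac{\eta w}{n}\right) + k\eta w\,H\!\left(\frac{1}{\eta}\right) - (k-1)\,nH\!\left(\frac{w}{n}\right) \tag{22} \]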

and observe by (1) and (21) that $\log U_k(n,w) \le L_k(n,w) + O(1)$ for all large enough $n$ and
$w = \lfloor 2/\delta \rfloor + 1, \lfloor 2/\delta \rfloor + 2, \ldots, \lfloor n/(2\eta) \rfloor$. Observe that $L_k(n, 2/\delta) \le -2\log(n) + O(1)$ and that,
differentiating (22) with respect to $w$ and letting $\lambda = w/n$,

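\[ L_k'(n,w) = k\eta\,H\!\left(\frac{1}{\eta}\right) + \log\left(\frac{\lambda^{\delta}(1-\eta\lambda)^{\eta}}{\eta^{\eta}(1-\lambda)^{k-1}}\right). \tag{23} \]

Because the log-term in (23) decreases without bound as $\lambda \to 0^+$, there exists a $\beta > 0$ such that
$L_k(n,w)$ is decreasing as $w = \lfloor 2/\delta \rfloor + 1, \lfloor 2/\delta \rfloor + 2, \ldots, \lfloor \beta n \rfloor$. Thus,
$\sum_{w=1}^{\lfloor \beta n \rfloor} U_k(n,w) \le O(1/n) + \beta n \cdot O(1/n^2) = O(1/n)$. The claim now follows from Theorem 6. □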




Appendix B. Proof of Theorem 5
We require first a preliminary result.

Theorem 14 (Saddle point asymptotics for coefficients of a polynomial power). Let $P(z)$ be a
polynomial of degree $d \ge 1$ with a positive constant term and positive coefficients such that the
greatest common divisor of the degrees of the nonzero terms of $P(z)$ is 1, and let $\Lambda$ be any compact
subinterval of the open interval $(0, d)$. Then, for all large enough $n$, it holds uniformly for all
integers $N = \lambda n$ with $\lambda \in \Lambda$ that

\[ [z^N]\{P(z)^n\} = \frac{P(\xi)^n}{\xi^{N+1}\sqrt{2\pi n K_\lambda''(\xi)}}\,(1 + o(1)), \tag{24} \]

where $K_\lambda(z) = \log(P(z)) - \lambda\log(z)$ and $\xi = \xi(\lambda)$ is the unique positive solution of $K_\lambda'(\xi) = 0$.

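Proof. It follows from the assumptions on $P$ that $\xi$ exists and is unique for every $\lambda \in \Lambda$. Applying
Cauchy's coefficient formula on the circular contour $z(\theta) = \xi\exp(i\theta)$, $-\pi \le \theta \le \pi$, we have

\[ [z^N]\{P(z)^n\} = \frac{1}{2\pi i}\int_{-\pi}^{\pi} \frac{P(\xi\exp(i\theta))^n}{(\xi\exp(i\theta))^{N+1}}\, i\xi\exp(i\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{P(\xi\exp(i\theta))^n}{(\xi\exp(i\theta))^{\lambda n}}\,d\theta. \]

By the assumptions on $P$, the modulus of $P(z)$ assumes its maximum value on the contour if and
only if $z = \xi$. Thus, as $n$ increases, the neighborhood of $\xi$ produces the (exponentially) dominant
contribution to the integral. In particular, letting $\theta_0 = n^{-2/5}$, we have

\[ [z^N]\{P(z)^n\} = \frac{1}{2\pi}(1 + o(1))\int_{-\theta_0}^{\theta_0} \frac{P(\xi\exp(i\theta))^n}{(\xi\exp(i\theta))^{\lambda n}}\,d\theta = \frac{1}{2\pi}(1 + o(1))\int_{-\theta_0}^{\theta_0} \exp\bigl(nK_\lambda(\xi\exp(i\theta))\bigr)\,d\theta. \]

Assuming that $n$ is large enough, $K_\lambda$ is analytic on every straight line segment connecting $\xi$ to
$\xi\exp(i\theta)$ with $-\theta_0 \le \theta \le \theta_0$. Thus, we have (see e.g. [Ml §7.1]),

\[ K_\lambda(\xi\exp(i\theta)) = K_\lambda(\xi) + K_\lambda'(\xi)(\xi\exp(i\theta) - \xi) + \frac{K_\lambda''(\xi)}{2}(\xi\exp(i\theta) - \xi)^2 + \frac{(\xi\exp(i\theta) - \xi)^3}{2}\int_0^1 (t-1)^2 K_\lambda'''\bigl(\xi + t(\xi\exp(i\theta) - \xi)\bigr)\,dt. \]

By assumption we have that $K_\lambda'(\xi) = 0$. Furthermore, the last integral and $\xi$ are bounded when
$\lambda \in \Lambda$. Thus, using $\exp(i\theta) = 1 + i\theta + O(n^{-4/5})$ for the second-order term and $\exp(i\theta) = 1 + O(n^{-2/5})$
for the coefficient of the integral, we have

\[ K_\lambda(\xi\exp(i\theta)) = K_\lambda(\xi) - \frac{K_\lambda''(\xi)\,\xi^2\theta^2}{2} + O(n^{-6/5}) \]

uniformly for $-\theta_0 \le \theta \le \theta_0$. It follows that

\[ [z^N]\{P(z)^n\} = (1 + o(1))\,\frac{\exp(nK_\lambda(\xi))}{2\pi}\int_{-\theta_0}^{\theta_0} \exp\bigl(-nK_\lambda''(\xi)\,\xi^2\theta^2/2\bigr)\,d\theta \]
\[ = (1 + o(1))\,\frac{\exp(nK_\lambda(\xi))}{2\pi\sqrt{n}}\int_{-n^{1/10}}^{n^{1/10}} \exp\bigl(-K_\lambda''(\xi)\,\xi^2 t^2/2\bigr)\,dt \]
\[ = (1 + o(1))\,\frac{P(\xi)^n}{\xi^{N+1}\sqrt{2\pi n K_\lambda''(\xi)}}, \]

where the second equality follows from the change of variables $\theta = t/\sqrt{n}$ and the last equality
follows from $\int_{-\infty}^{\infty} \exp(-t^2)\,dt = \sqrt{\pi}$. □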
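A quick numerical check of the saddle point estimate is possible in the simplest admissible case $P(z) = 1 + z$, where $[z^N]\{P(z)^n\} = \binom{n}{N}$ is known exactly. The sketch below assumes the estimate has the form $P(\xi)^n / (\xi^{N+1}\sqrt{2\pi n K_\lambda''(\xi)})$ with $K_\lambda(z) = \log P(z) - \lambda\log z$; for $P(z) = 1+z$ the saddle point is $\xi = \lambda/(1-\lambda)$. The function name and the sample values of $n$ and $N$ are illustrative.

```python
from math import comb, exp, log, pi, sqrt

def saddle_point_estimate_binomial(n, N):
    """Saddle point estimate of [z^N](1+z)^n, computed via logarithms for stability."""
    lam = N / n
    xi = lam / (1 - lam)                          # solves K'(xi) = 1/(1+xi) - lam/xi = 0
    k2 = -1 / (1 + xi) ** 2 + lam / xi ** 2       # K''(xi)
    log_estimate = n * log(1 + xi) - (N + 1) * log(xi) - 0.5 * log(2 * pi * n * k2)
    return exp(log_estimate)

if __name__ == "__main__":
    for n, N in ((50, 20), (200, 80), (800, 320)):
        ratio = saddle_point_estimate_binomial(n, N) / comb(n, N)
        print(n, N, ratio)
```

The printed ratios should approach 1 as $n$ grows with $\lambda = N/n$ held fixed, in line with the $(1 + o(1))$ factor; in this special case the estimate coincides with the leading term of Stirling's formula for the binomial coefficient.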

We now proceed with the proof of Theorem 5.




Proof. Observe that $0 < \mu < d$ and let $\Lambda$ be any compact subinterval of $(0, d)$ with $\mu$ in its
interior. We proceed to apply Theorem 14 with sufficient approximations to $\xi$, $K_\lambda''(\xi)$, and $P(\xi)/\xi^\lambda$
as $n$ increases and $\lambda = N/n = \mu + \nu/n$ with $|\lambda - \mu| \le n^{\delta-1}$. First, observe that $\lambda = \mu$ implies
$\xi(\lambda) = 1$. Developing $\xi(\lambda)$ into a truncated Taylor series at $\lambda = \mu$ using the defining equality
$\xi(\lambda)P'(\xi(\lambda)) = \lambda P(\xi(\lambda))$, we have, after some calculation, uniformly

(25) C(A) = 1 + e'(/u)(A - ^i) + 0(n2('5-i)) = 1 + ^ + ©(n^^^-D) . 

Some more calculation gives the uniform approximations 

and 

p„ . p„) , o(„=<-..) . PC) (i . -'/'^-'">^ . 

The approximations (25), (26), and (27) applied to (24) establish the claim. □